Learning to Rank Answers to Why-Questions
Abstract
The goal of the current research project is to develop a question answering system for why-questions (why-QA). Our system is a pipeline consisting of an off-the-shelf retrieval module followed by an answer re-ranking module. In this paper, we aim to improve the ranking performance of our system by finding the optimal approach to learning to rank. More specifically, we try to find the optimal ranking function to apply to the set of candidate answers in the re-ranking module. We experiment with a number of machine learning algorithms (genetic algorithms, logistic regression and SVM) with different cost functions. We find that a learning-to-rank approach using either a regression technique or a genetic algorithm that optimizes for MRR leads to a significant improvement over the TF-IDF baseline. We reach an MRR of 0.341 with a success@10 score of 58.82%. We also see that, as opposed to logistic regression and genetic algorithms, SVM is not suitable for the current data representation: after extensive experiments with SVMs, we still reach scores that are below baseline. In future work, we will investigate the limitations of our re-ranking approach in more detail: which set of questions cannot be answered in the current system set-up, and why?
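The two evaluation measures reported above, MRR and success@10, can be illustrated with a minimal sketch. It assumes the standard definitions (MRR is the mean over questions of 1/rank of the first correct answer, counting 0 when no correct answer is retrieved; success@k is the fraction of questions with at least one correct answer in the top k). The function names and the toy rankings are hypothetical and are not taken from the paper or its data.

```python
from typing import List, Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Return 1/rank of the first correct candidate, or 0.0 if there is none."""
    for rank, is_correct in enumerate(relevance, start=1):
        if is_correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings: List[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all questions."""
    if not rankings:
        return 0.0
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)


def success_at_k(rankings: List[Sequence[bool]], k: int = 10) -> float:
    """Fraction of questions with at least one correct answer in the top k."""
    if not rankings:
        return 0.0
    hits = sum(1 for r in rankings if any(r[:k]))
    return hits / len(rankings)


# Hypothetical toy data: one ranked candidate list per question,
# True marks a candidate judged to be a correct answer.
toy_rankings = [
    [False, True, False],       # first correct answer at rank 2
    [True, False],              # first correct answer at rank 1
    [False] * 11 + [True],      # first correct answer at rank 12
    [False, False],             # no correct answer retrieved
]

mrr = mean_reciprocal_rank(toy_rankings)
s10 = success_at_k(toy_rankings, k=10)
print(f"MRR = {mrr:.3f}, success@10 = {s10:.2%}")
```

In this toy set, the third question contributes to MRR (1/12) but not to success@10, which shows why the two measures can diverge for the same ranking.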
Similar resources
Learning to Rank QA Data: Evaluating Machine Learning Techniques for Ranking Answers to Why-Questions
In this work, we evaluate a number of machine learning techniques for the purpose of ranking answers to why-questions. We use a set of 37 linguistically motivated features that characterize questions and answers. We experiment with a number of machine learning techniques in various settings. The purpose of the experiments is to assess how the different machine learning techniques can cope with ...
QU-IR at SemEval 2016 Task 3: Learning to Rank on Arabic Community Question Answering Forums with Word Embedding
Resorting to community question answering (CQA) websites for finding answers has gained momentum in the past decade with the explosive rate at which social media has been proliferating. With many questions left unanswered on those websites, automatic and smart question answering systems have seen light. One of the main objectives of such systems is to harness the plethora of existing answered q...
Corpus-based Question Answering for why-Questions
This paper proposes a corpus-based approach for answering why-questions. Conventional systems use hand-crafted patterns to extract and evaluate answer candidates. However, such hand-crafted patterns are likely to have low coverage of causal expressions, and it is also difficult to assign suitable weights to the patterns by hand. In our approach, causal expressions are automatically collected fr...
Learning to Rank Answers on Large Online QA Collections
This work describes an answer ranking engine for non-factoid questions built using a large online community-generated question-answer collection (Yahoo! Answers). We show how such collections may be used to effectively set up large supervised learning experiments. Furthermore we investigate a wide range of feature types, some exploiting NLP processors, and demonstrate that using them in combina...
Learning to Rank Effective Paraphrases from Query Logs for Community Question Answering
We present a novel method for ranking query paraphrases for effective search in community question answering (cQA). The method uses query logs from Yahoo! Search and Yahoo! Answers for automatically extracting a corpus of paraphrases of queries and questions using the query-question click history. Elements of this corpus are automatically ranked according to recall and mean reciprocal rank, and...